Varieties of Noisy Harmonic Grammar
Bruce Hayes
Abstract
Noisy Harmonic Grammar (NHG) is a framework for stochastic grammars that uses the GEN-cum-EVAL system that originated in Optimality Theory. As a form of Harmonic Grammar, NHG outputs as winner the candidate with the smallest harmonic penalty (the weighted sum of its constraint violations). It is stochastic because at each "evaluation time," constraint weights are nudged upward or downward by a random amount, resulting in a particular probability distribution over candidates. This "classical" form of NHG can be modified in various ways, creating alternative theories. I explore these variants in a variety of simple simulations intended to reveal key differences in their behavior; maxent grammars are also included in the comparison. In conclusion I offer hints from the empirical world regarding which of these rival theories might be correct.¹

¹ Thanks to Kie Zuraw for substantial advice and assistance, and to Adam Albright, Joe Pater, Brian Smith, and talk audiences at the UCLA Phonology Seminar and AMP for their helpful feedback.

Bruce Hayes, Varieties of Noisy Harmonic Grammar

1. Background: Stochastic constraint-based grammar frameworks in modern linguistics

The key innovation of Optimality Theory (Prince and Smolensky 1993) was its GEN-plus-EVAL architecture: GEN enumerates candidates, and EVAL, consisting of a set of constraints, selects the winning candidate from GEN as the output. This conception naturally led to the idea of a stochastic version of the theory, in which EVAL outputs not one single candidate but rather a probability distribution over GEN. Such a framework would provide a natural account of free variation and related phenomena: we get multiple outputs when the conflict between constraints is not completely resolved. Early such frameworks included the Floating Constraint model (Nagy and Reynolds 1997), Partial Ordering OT (Anttila 1997), and Stochastic OT (Boersma 1997), and with time still other approaches were put forward.
The pursuit of such frameworks is empirically well-motivated because gradience in language is so widespread (see, e.g., Bod et al. 2003, Fanselow et al. 2006). Everywhere we look, if we look carefully, we find free variation in the outputs of grammar and gradience in well-formedness intuitions. The work of Zuraw (2000, 2010) opened up a new domain of gradience: quantitative phonological patterns in the lexicon, often perceived accurately by language learners, with the ambient probability distributions accurately reproduced by them under experimental probing. Empirical work continues to support this "Law of Frequency Matching" (Hayes et al. 2009) as a baseline prediction for human phonological learning. All of these phenomena need an appropriate framework for formal analysis.²

The earlier phases of research on stochastic grammar frameworks were tinged with a practical orientation: it was felt to be important simply to establish that the frameworks really could work in nontrivial cases (e.g. Boersma and Hayes 2001:46). It was also felt important to assess the learning algorithms offered in tandem with new stochastic frameworks. Such research is exemplified by the counterexample Pater (2008) discovered to the Gradual Learning Algorithm for Stochastic OT, or by Goldwater and Johnson's (2003) pointing out the sensibleness of using maxent, based on its strong and reliable mathematical foundations. Soon, however, purely scientific goals entered the debate: different stochastic frameworks treat the same gradient data in different ways and make different empirical predictions, so that the choice of framework is really part of linguistic theory. Early on, Jesney (2007) demonstrated the substantial differences between Noisy Harmonic Grammar and maxent grammars in their treatment of consonant deletion for a CVCC input.
Similarly, Boersma and Pater (2008/2016:410-412) compared versions of Harmonic Grammar in which constraint weights are either allowed or not allowed to go below zero. This article is intended as a contribution to this line of research: I put forth some cases in which different varieties of Noisy Harmonic Grammar, as well as maximum entropy grammars, make different predictions.

² My sense is that there remains some inertia among linguists in the recognition of variation as an empirical reality. I conjecture that this may be a consequence of traditional research methodologies of the field. It is still often the case that fieldworker and theorist never meet, and the latter takes the former's first-pass best estimate as the analytical target. Different results are often obtained if the theorist has access to corpora, or performs experiments. Another factor may be the widespread use of problem sets (e.g. in first-year graduate training) in which the data have been intentionally cleansed of variation.

2. Background on the frameworks examined

2.1 Noisy Harmonic Grammar

Noisy Harmonic Grammar (hereafter "NHG") is stochasticized Harmonic Grammar. Harmonic Grammar, which has an ancestry that predates OT (Legendre et al. 1990, Legendre et al. 2006, Pater 2008/2016, Potts et al. 2010), uses the same GEN-cum-EVAL architecture, but instead of ranking the constraints, it assigns them numerical weights reflecting their relative strength. In the nonstochastic version of the framework, winning candidates are selected as follows. For each candidate's row in the tableau, one multiplies violation counts by the corresponding constraint weights and adds up the total across constraints. This yields the harmony, a kind of penalty score. The winning candidate is the least penalized one, i.e. the one with the lowest harmony.
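The selection procedure just described is easy to express in code. The sketch below is a toy illustration (not code from the article); the candidate names, weights, and violation counts are illustrative assumptions, chosen to match the sort of small two-constraint tableau used in this section.

```python
# Toy implementation of nonstochastic Harmonic Grammar evaluation.
# Candidate names, weights, and violation counts are illustrative assumptions.

def harmony(violations, weights):
    """Harmony as a penalty: violation counts times constraint weights, summed."""
    return sum(v * w for v, w in zip(violations, weights))

def hg_winner(candidates, weights):
    """The winner is the candidate with the lowest harmony penalty."""
    return min(candidates, key=lambda c: harmony(candidates[c], weights))

weights = [2, 1]                # CONSTRAINT1 weighted 2, CONSTRAINT2 weighted 1
candidates = {
    "Candidate 1": [0, 2],      # harmony = 0*2 + 2*1 = 2
    "Candidate 2": [1, 1],      # harmony = 1*2 + 1*1 = 3
}
```

Here `hg_winner(candidates, weights)` returns "Candidate 1", the less penalized candidate.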
In (1), we first see an OT-like tableau, with two constraints and two candidates, along with the weights of the constraints. Tableau (1b) shows how the basic computations are carried out, and it can be seen that Candidate 1, with the lesser harmony penalty, emerges as the winner.

(1) Illustration of nonstochastic Harmonic Grammar

a.  /Input/        CONSTRAINT1    CONSTRAINT2
    weights:       2              1
    Candidate 1                   **
    Candidate 2    *              *

b.  /Input/        CONSTRAINT1    CONSTRAINT2    Harmony
    weights:       2              1
    Candidate 1                   ** × 1 = 2     0 + 2 = 2
    Candidate 2    * × 2 = 2      * × 1 = 1      2 + 1 = 3

In what follows I will discuss stochastic versions of this theory.

2.2 Classical Noisy Harmonic Grammar

In what I will call Classical Noisy Harmonic Grammar, put forth by Boersma and Pater (2008/2016), Harmonic Grammar is stochasticized as follows. At each "evaluation time" (moment of application of the grammar), every constraint weight is separately perturbed upward or downward by a random amount drawn from a Gaussian distribution with mean 0 and a uniform standard deviation. This perturbation factor is called the noise; I will here employ a noise value of 1. Because of the noise, there can often be different winners on different evaluation times, depending on what random noise values happen to have been assigned. For example, in (2) noise values (designated here as Nn) are added to each weight, influencing the computation of harmony. Candidate 2 is penalized by a greater "base value" of harmony³ (3 vs. 2), making it most likely that Candidate 1 will win. But if the noise values happen to be such that N2 exceeds N1 by more than one, then Candidate 2 will win instead. The math, when worked out, shows that Candidate 2 will in fact win 24.0% of the time.

³ Different scholars use different versions of Harmony, varying only in a negative sign (for some, Harmony is a negative quantity). The nomenclature/choice of signs followed here is taken from Wilson (2006).
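The 24.0% figure can be checked without simulation: Candidate 2 wins exactly when N2 − N1 > 1, and the difference of two independent unit Gaussians is itself Gaussian with standard deviation √2. A quick standard-library check (my own illustration, using the harmonies of the two-candidate example):

```python
from statistics import NormalDist

# Candidate 2 wins when 3 + N1 + N2 < 2 + 2*N2, i.e. when N2 - N1 > 1.
# N1 and N2 are independent Gaussians (mean 0, sd 1), so N2 - N1 is
# Gaussian with mean 0 and standard deviation sqrt(2).
p_cand2 = 1 - NormalDist(mu=0.0, sigma=2 ** 0.5).cdf(1.0)
print(round(p_cand2, 3))  # 0.24
```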
The frequency pattern is shown in (2) informally with pointing fingers of different sizes.

(2) Illustration of Classical Noisy Harmonic Grammar

    /Input/        CONSTRAINT1      CONSTRAINT2      Harmony
    weights:       2 + N1           1 + N2
    Candidate 1                     ** × (1 + N2)    2 + 2N2
    Candidate 2    * × (2 + N1)     * × (1 + N2)     3 + N1 + N2

The probability of 24.0% can be calculated, for instance, by performing various integrations over Gaussian distributions, a procedure that becomes laborious for all but simple cases (see Zuraw 2000:105-113). A convenient alternative is to use the Monte Carlo method: one runs the same grammar, say, 100,000 times, counting up winners; the counts obtained, divided by 100,000, serve as a reasonable estimate of the probabilities generated by the grammar.

The procedure established in Classical Noisy Harmonic Grammar represents just one choice from a whole set of logical possibilities for introducing noise into Harmonic Grammar. I conjecture that this choice was carried over from the pioneering first theory that used noise to stochasticize a constraint-based framework, namely Boersma's (1997) Stochastic OT. Just as Stochastic OT adds noise to the "ranking values" of constraints and then uses the result to pick a winner by the rules of nonstochastic OT, so Classical NHG adds noise directly to the constraint weights and picks a winner by the rules of nonstochastic Harmonic Grammar. The alternatives to Classical NHG that I will consider here derive from just where in the system one "adds in the noise". I will explore variants using cell-specific noise (§2.3) as well as variants that add the noise in late, e.g. after the multiplication of weights by violation counts (§3.3).

2.3 NHG with cell-specific noise

Goldrick and Daland (2009) suggest that we can profitably explore more fine-grained assignments of noise than in Classical NHG.
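The Monte Carlo procedure takes only a few lines. The sketch below is my own illustration for tableau (2), with a fixed random seed so the run is reproducible; it recovers an estimate close to the exact 24.0%:

```python
import random

random.seed(1)  # fixed seed so the estimate is reproducible

def one_evaluation():
    """One evaluation time in Classical NHG: perturb each weight, pick the winner."""
    w1 = 2 + random.gauss(0, 1)   # CONSTRAINT1 weight plus noise N1
    w2 = 1 + random.gauss(0, 1)   # CONSTRAINT2 weight plus noise N2
    h_cand1 = 2 * w2              # Candidate 1: two violations of CONSTRAINT2
    h_cand2 = w1 + w2             # Candidate 2: one violation of each constraint
    return "Candidate 2" if h_cand2 < h_cand1 else "Candidate 1"

trials = 100_000
cand2_wins = sum(one_evaluation() == "Candidate 2" for _ in range(trials))
p_cand2 = cand2_wins / trials     # comes out near 0.24
```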
Their proposal is actually made within a connectionist implementation of Harmonic Grammar that has other implications as well (for instance, different weights for + → − Faithfulness mappings than for − → +), so I will here follow what is in context a more modest change to Classical NHG. Rather than perturbing the original constraint weights, we instead install a fresh noise value for every cell. For now, we assume that cells with no violations are not given a noise value, though this assumption will be revised later on. Here is an example:

(3) Illustration of Noisy Harmonic Grammar with cell-specific noise

    /Input/        CONSTRAINT1      CONSTRAINT2      Harmony
    weights:       2                1
    Candidate 1                     ** × (1 + N2)    2 + 2N2
    Candidate 2    * × (2 + N1)     * × (1 + N3)     3 + N1 + N3

Let us compare the harmony values from the tableaux of (2) and (3): the Classical theory assigns Candidate 2 a perturbed harmony of 3 + N1 + N2, whereas the theory with cell-specific noise assigns it a perturbed harmony of 3 + N1 + N3; for both theories the harmony of Candidate 1 is 2 + 2N2. We can think of (3) as displaying "cell-granularity" and (2) "constraint-granularity". The two make differing predictions, as will be illustrated shortly.
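One way to see the differing predictions concretely: in (2) Candidate 2 wins when N2 − N1 > 1, a Gaussian event with variance 2, while in (3) it wins when 2N2 − N1 − N3 > 1, with variance 6. The larger variance makes the upset win more likely under cell-specific noise. A standard-library check (my own illustration, not a computation from the article):

```python
from statistics import NormalDist

z = NormalDist()  # standard normal, for cumulative probabilities

# Classical NHG, tableau (2): Candidate 2 wins when N2 - N1 > 1.
# Var(N2 - N1) = 1 + 1 = 2.
p_classical = 1 - z.cdf(1 / 2 ** 0.5)

# Cell-specific noise, tableau (3): Candidate 2 wins when
# 2 + 2*N2 > 3 + N1 + N3, i.e. 2*N2 - N1 - N3 > 1.
# Var(2*N2 - N1 - N3) = 4 + 1 + 1 = 6.
p_cell_specific = 1 - z.cdf(1 / 6 ** 0.5)

print(round(p_classical, 3), round(p_cell_specific, 3))  # 0.24 0.342
```

So with the same weights and violation profile, the two variants assign Candidate 2 noticeably different probabilities (about 24% vs. 34%).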
Similar articles
Stochastic Harmonic Grammars as Random Utility Models
There are a variety of ways of building stochastic grammars based on Harmonic Grammar (Hayes 2017). A basic division is often drawn between ‘Maximum Entropy’ grammars and Noisy Harmonic Grammars, which are superficially quite different in form. However both can be formulated as Random Utility Models, which are widely used in economics to model choice among discrete alternatives (Train 2009). Ex...
Automatic Functional Harmonic Analysis
Music scholars have been studying tonal harmony intensively for centuries, yielding numerous theories and models. Unfortunately, a large number of these theories are formulated in a rather informal fashion and lack mathematical precision. In this article we present HarmTrace, a functional model of Western tonal harmony that builds on well-known theories of tonal harmony. In contrast to other ap...
A Parser for Harmonic Context-Free Grammars
Harmonic Grammar is a connectionist-derived grammar formalism, of which Optimality Theory is a kind of limiting case. Harmonic Grammar is expressive enough to specify the trees that are correct parses on a given context-free grammar. Here, we show how to construct a connectionist parsing network which finds correct parses given a sentence, or, if none exist, signals a rejection. Finally, a brief ...
Towards a generative syntax of tonal harmony
This paper aims to propose a hierarchical, generative account of diatonic harmonic progressions and suggest a set of phrase-structure grammar rules. It argues that the structure of harmonic progressions exceeds the simplicity of the Markovian transition tables and proposes a set of rules to account for harmonic progressions with respect to key structure, functional and scale degree features as ...
A generative grammar approach to diatonic harmonic structure
This paper aims to give a hierarchical, generative account of diatonic harmony progressions and proposes a generative phrase-structure grammar. The formalism accounts for structural properties of key, functional, scale and surface level. Being related to linguistic approaches in generative syntax and to the hierarchical account of tonality in the generative theory of tonal music (GTTM) [1], cad...